Search CORE

14 research outputs found

RFreak-An R-package for evolutionary computation

Author: Nunkesser Robin
Publication venue
Publication date
Field of study

RFreak is an R package providing a framework for evolutionary computation. By enwrapping the functionality of an evolutionary algorithm kit written in Java, it offers an easy way to do evolutionary computation in R. In addition, application examples where an evolutionary approach is promising in computational statistics are included and described in this paper. The package is thus further supporting the use of evolutionary computation in computational statistics. --R,evolutionary algorithms,evolutionary computation,association study,robust regression

Research Papers in Economics

Evolutionary algorithms for robust methods

Author: Morell Oliver
Nunkesser Robin
Publication venue
Publication date
Field of study

A drawback of robust statistical techniques is the increased computational effort often needed compared to non robust methods. Robust estimators possessing the exact fit property, for example, are NP-hard to compute. This means thatunder the widely believed assumption that the computational complexity classes NP and P are not equalthere is no hope to compute exact solutions for large high dimensional data sets. To tackle this problem, search heuristics are used to compute NP-hard estimators in high dimensions. Here, an evolutionary algorithm that is applicable to different robust estimators is presented. Further, variants of this evolutionary algorithm for selected estimatorsmost prominently least trimmed squares and least median of squaresare introduced and shown to outperform existing popular search heuristics in difficult data situations. The results increase the applicability of robust methods and underline the usefulness of evolutionary computation for computational statistics. --Evolutionary algorithms,robust regression,least trimmed squares (LTS),least median of squares (LMS),least quantile of squares (LQS),least quartile difference (LQD)

Research Papers in Economics

Representation of graphs by OBDDs

Author: Nunkesser Robin
Woelfel Philipp
Publication venue: Elsevier B.V.
Publication date: 28/01/2009
Field of study

AbstractRecently, it has been shown in a series of works that the representation of graphs by Ordered Binary Decision Diagrams (OBDDs) often leads to good algorithmic behavior. However, the question for which graph classes an OBDD representation is advantageous, has not been investigated, yet. In this paper, the space requirements for the OBDD representation of certain graph classes, specifically cographs, several types of graphs with few P4s, unit interval graphs, interval graphs and bipartite graphs are investigated. Upper and lower bounds are proven for all these graph classes and it is shown that in most (but not all) cases a representation of the graphs by OBDDs is advantageous with respect to space requirements

Elsevier - Publisher Connector

Computing the Least Quartile Difference Estimator in the Plane

Author: Bernholt Thorsten
Nunkesser Robin
Schettlinger Karen
Publication venue
Publication date
Field of study

A common problem in linear regression is that largely aberrant values can strongly influence the results. The least quartile difference (LQD) regression estimator is highly robust, since it can resist up to almost 50% largely deviant data values without becoming extremely biased. Additionally, it shows good behavior on Gaussian data – in contrast to many other robust regression methods. However, the LQD is not widely used yet due to the high computational effort needed when using common algorithms, e.g. the subset algorithm of Rousseeuw and Leroy. For computing the LQD estimator for n data points in the plane, we propose a randomized algorithm with expected running time O(n2 log2 n) and an approximation algorithm with a running time of roughly O(n2 log n). It can be expected that the practical relevance of the LQD estimator will strongly increase thereby. --

Research Papers in Economics

Algorithms for regression and classification

Author: Nunkesser Robin
Publication venue
Publication date
Field of study

Regression and classification are statistical techniques that may be used to extract rules and patterns out of data sets. Analyzing the involved algorithms comprises interdisciplinary research that offers interesting problems for statisticians and computer scientists alike. The focus of this thesis is on robust regression and classification in genetic association studies. In the context of robust regression, new exact algorithms and results for robust online scale estimation with the estimators Qn and Sn and for robust linear regression in the plane with the estimator least quartile difference (LQD) are presented. Additionally, an evolutionary computation algorithm for robust regression with different estimators in higher dimensions is devised. These estimators include the widely used least median of squares (LMS) and least trimmed squares (LTS). For classification in genetic association studies, this thesis describes a Genetic Programming algorithm that outpeforms the standard approaches on the considered data sets. It is able to identify interesting genetic factors not found before in a data set on sporadic breast cancer and to handle larger data sets than the compared methods. In addition, it is extendible to further application fields

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Detecting high-order interactions of single nucleotide polymorphisms using genetic programming

Author: Bernholt Thorsten
Ickstadt Katja
Nunkesser Robin
Schwender Holger
Wegener Ing
Publication venue
Publication date
Field of study

Motivation: Not individual single nucleotide polymorphisms (SNPs), but high-order interactions of SNPs are assumed to be responsible for complex diseases such as cancer. Therefore, one of the major goals of genetic association studies concerned with such genotype data is the identification of these high-order interactions. This search is additionally impeded by the fact that these interactions often are only explanatory for a relatively small subgroup of patients. Most of the feature selection methods proposed in the literature, unfortunately, fail at this task, since they can either only identify individual variables or interactions of a low order, or try to find rules that are explanatory for a high percentage of the observations. In this paper, we present a procedure based on genetic programming and multi-valued logic that enables the identification of high-order interactions of categorical variables such as SNPs. This method called GPAS (Genetic Programming for Association Studies) cannot only be used for feature selection, but can also be employed for discrimination. Results: In an application to the genotype data from the GENICA study, an association study concerned with sporadic breast cancer, GPAS is able to identify high-order interactions of SNPs leading to a considerably increased breast cancer risk for different subsets of patients that are not found by other feature selection methods. As an application to a subset of the HapMap data shows, GPAS is not restricted to association studies comprising several ten SNPs, but can also be employed to analyze whole-genome data. --

Research Papers in Economics

An evolutionary algorithm for robust regression

Author: Morell Oliver
Nunkesser Robin
Publication venue
Publication date
Field of study

A drawback of robust statistical techniques is the increased computational effort often needed as compared to non-robust methods. Particularly, robust estimators possessing the exact fit property are NP-hard to compute. This means that--under the widely believed assumption that the computational complexity classes NP and P are not equal--there is no hope to compute exact solutions for large high dimensional data sets. To tackle this problem, search heuristics are used to compute NP-hard estimators in high dimensions. A new evolutionary algorithm that is applicable to different robust estimators is presented. Further, variants of this evolutionary algorithm for selected estimators--most prominently least trimmed squares and least median of squares--are introduced and shown to outperform existing popular search heuristics in difficult data situations. The results increase the applicability of robust methods and underline the usefulness of evolutionary algorithms for computational statistics.Random search heuristics Robust regression Least trimmed squares (LTS) Least median of squares (LMS) Least quantile of squares (LQS) Least quartile difference (LQD)

Research Papers in Economics

Computing the least quartile difference estimator in the plane

Author: Bernholt Thorsten
Nunkesser Robin
Schettlinger Karen
Publication venue
Publication date
Field of study

Research Papers in Economics

K.: Computing the least quartile difference estimator in the plane

Author: Karen Schettlinger
Robin Nunkesser
Thorsten Bernholt
Publication venue
Publication date
Field of study

Abstract. A common problem in linear regression is that largely aberrant values can strongly influence the results. The least quartile difference (LQD) regression estimator is highly robust, since it can resist up to almost 50 % largely deviant data values without becoming extremely biased. Additionally, it shows good behavior on Gaussian data – in contrast to many other robust regression methods. However, the LQD is not widely used yet due to the high computational effort needed when using common algorithms, e.g. the subset algorithm of Rousseeuw and Leroy. For computing the LQD estimator for n data points in the plane, we propose a randomized algorithm with expected running time O(n 2 log 2 n) and an approximation algorithm with a running time of roughly O(n 2 log n). It can be expected that the practical relevance of the LQD estimator will strongly increase thereby.

CiteSeerX

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung